Overview

Dataset statistics

Number of variables: 13
Number of observations: 506
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 51.5 KiB
Average record size in memory: 104.3 B

Variable types

Numeric: 12
Categorical: 1

Warnings

RAD is highly correlated with TAX (High correlation)
TAX is highly correlated with RAD (High correlation)
ZN has 372 (73.5%) zeros (Zeros)
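The zeros warning can be checked by hand from the two counts reported above. The snippet below is a minimal sketch of that arithmetic only; it does not reproduce the rule the profiling tool uses to decide when to raise the warning.

```python
# Counts taken from this report: 372 zeros among the 506 observations of ZN.
n_zeros = 372
n_observations = 506

# Share of zeros, expressed as a percentage.
zeros_pct = 100 * n_zeros / n_observations
print(f"ZN has {n_zeros} ({zeros_pct:.1f}%) zeros")
```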

Reproduction

Analysis started: 2021-03-09 14:13:06.931164
Analysis finished: 2021-03-09 14:13:41.145229
Duration: 34.21 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
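The reported duration follows from the two timestamps. A minimal sketch using only the Python standard library, with the timestamps copied from this section:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2021-03-09 14:13:06.931164")
finished = datetime.fromisoformat("2021-03-09 14:13:41.145229")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```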

Variables

CRIM
Real number (ℝ≥0)

Distinct: 504
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.613523557
Minimum: 0.00632
Maximum: 88.9762
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0.00632
5-th percentile: 0.02791
Q1: 0.082045
median: 0.25651
Q3: 3.6770825
95-th percentile: 15.78915
Maximum: 88.9762
Range: 88.96988
Interquartile range (IQR): 3.5950375
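Two of the quantile statistics are derived from the others: the range is the maximum minus the minimum, and the interquartile range is Q3 minus Q1. A small sketch that checks this against the CRIM figures above (variable names are illustrative):

```python
# Quantile statistics for CRIM, copied from this report.
minimum, q1, q3, maximum = 0.00632, 0.082045, 3.6770825, 88.9762

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1

print(f"Range: {value_range:.5f}")
print(f"IQR:   {iqr:.7f}")
```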

Descriptive statistics

Standard deviation: 8.601545105
Coefficient of variation (CV): 2.380376098
Kurtosis: 37.13050913
Mean: 3.613523557
Median Absolute Deviation (MAD): 0.22145
Skewness: 5.223148798
Sum: 1828.44292
Variance: 73.9865782
Monotonicity: Not monotonic
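Several descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks these against the CRIM values reported here; it only illustrates the relationships and is not the tool's own computation.

```python
# Descriptive statistics for CRIM, copied from this report.
std = 8.601545105
mean = 3.613523557
n_observations = 506

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the square of the standard deviation
total = mean * n_observations  # sum of all values

print(f"CV:       {cv:.6f}")
print(f"Variance: {variance:.5f}")
print(f"Sum:      {total:.4f}")
```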
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
14.3337             2      0.4%
0.01501             2      0.4%
0.08265             1      0.2%
0.537               1      0.2%
1.35472             1      0.2%
0.14103             1      0.2%
0.03502             1      0.2%
0.03615             1      0.2%
0.66351             1      0.2%
0.1265              1      0.2%
Other values (494)  494    97.6%

Value               Count  Frequency (%)
0.00632             1      0.2%
0.00906             1      0.2%
0.01096             1      0.2%
0.01301             1      0.2%
0.01311             1      0.2%

Value               Count  Frequency (%)
88.9762             1      0.2%
73.5341             1      0.2%
67.9208             1      0.2%
51.1358             1      0.2%
45.7461             1      0.2%

ZN
Real number (ℝ≥0)

ZEROS

Distinct: 26
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.36363636
Minimum: 0
Maximum: 100
Zeros: 372
Zeros (%): 73.5%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 12.5
95-th percentile: 80
Maximum: 100
Range: 100
Interquartile range (IQR): 12.5

Descriptive statistics

Standard deviation: 23.32245299
Coefficient of variation (CV): 2.052375864
Kurtosis: 4.031510084
Mean: 11.36363636
Median Absolute Deviation (MAD): 0
Skewness: 2.225666323
Sum: 5750
Variance: 543.9368137
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Value               Count  Frequency (%)
0                   372    73.5%
20                  21     4.2%
80                  15     3.0%
12.5                10     2.0%
22                  10     2.0%
25                  10     2.0%
40                  7      1.4%
45                  6      1.2%
30                  6      1.2%
90                  5      1.0%
Other values (16)   44     8.7%

Value               Count  Frequency (%)
0                   372    73.5%
12.5                10     2.0%
17.5                1      0.2%
18                  1      0.2%
20                  21     4.2%

Value               Count  Frequency (%)
100                 1      0.2%
95                  4      0.8%
90                  5      1.0%
85                  2      0.4%
82.5                2      0.4%

INDUS
Real number (ℝ≥0)

Distinct: 76
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.13677866
Minimum: 0.46
Maximum: 27.74
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0.46
5-th percentile: 2.18
Q1: 5.19
median: 9.69
Q3: 18.1
95-th percentile: 21.89
Maximum: 27.74
Range: 27.28
Interquartile range (IQR): 12.91

Descriptive statistics

Standard deviation: 6.860352941
Coefficient of variation (CV): 0.6160087358
Kurtosis: -1.233539601
Mean: 11.13677866
Median Absolute Deviation (MAD): 6.32
Skewness: 0.2950215679
Sum: 5635.21
Variance: 47.06444247
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
18.1                132    26.1%
19.58               30     5.9%
8.14                22     4.3%
6.2                 18     3.6%
21.89               15     3.0%
9.9                 12     2.4%
3.97                12     2.4%
8.56                11     2.2%
10.59               11     2.2%
5.86                10     2.0%
Other values (66)   233    46.0%

Value               Count  Frequency (%)
0.46                1      0.2%
0.74                1      0.2%
1.21                1      0.2%
1.22                1      0.2%
1.25                2      0.4%

Value               Count  Frequency (%)
27.74               5      1.0%
25.65               7      1.4%
21.89               15     3.0%
19.58               30     5.9%
18.1                132    26.1%

CHAS
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value               Count
0.0                 471
1.0                 35

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1518
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
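The character totals follow directly from the category counts: each value is a three-character string ("0.0" or "1.0"), so 506 observations give 1518 characters. A minimal sketch of that tally, assuming the values are stored as the strings shown in this report:

```python
from collections import Counter

# Category counts for CHAS, copied from this report.
value_counts = {"0.0": 471, "1.0": 35}

# Count every character of every value, weighted by how often the value occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_characters = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(f"{ch!r}: {count} ({100 * count / total_characters:.1f}%)")
print("Total characters:", total_characters)
```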

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Value               Count  Frequency (%)
0.0                 471    93.1%
1.0                 35     6.9%

Histogram of lengths of the category

Most occurring characters

Value               Count  Frequency (%)
0                   977    64.4%
.                   506    33.3%
1                   35     2.3%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1012   66.7%
Other Punctuation   506    33.3%

Most frequent character per category

Value               Count  Frequency (%)
0                   977    96.5%
1                   35     3.5%

Value               Count  Frequency (%)
.                   506    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              1518   100.0%

Most frequent character per script

Value               Count  Frequency (%)
0                   977    64.4%
.                   506    33.3%
1                   35     2.3%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               1518   100.0%

Most frequent character per block

Value               Count  Frequency (%)
0                   977    64.4%
.                   506    33.3%
1                   35     2.3%

NOX
Real number (ℝ≥0)

Distinct: 81
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5546950593
Minimum: 0.385
Maximum: 0.871
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0.385
5-th percentile: 0.40925
Q1: 0.449
median: 0.538
Q3: 0.624
95-th percentile: 0.74
Maximum: 0.871
Range: 0.486
Interquartile range (IQR): 0.175

Descriptive statistics

Standard deviation: 0.1158776757
Coefficient of variation (CV): 0.2089033853
Kurtosis: -0.06466713337
Mean: 0.5546950593
Median Absolute Deviation (MAD): 0.0875
Skewness: 0.7293079225
Sum: 280.6757
Variance: 0.01342763572
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.538               23     4.5%
0.713               18     3.6%
0.437               17     3.4%
0.871               16     3.2%
0.489               15     3.0%
0.624               15     3.0%
0.693               14     2.8%
0.605               14     2.8%
0.74                13     2.6%
0.544               12     2.4%
Other values (71)   349    69.0%

Value               Count  Frequency (%)
0.385               1      0.2%
0.389               1      0.2%
0.392               2      0.4%
0.394               1      0.2%
0.398               2      0.4%

Value               Count  Frequency (%)
0.871               16     3.2%
0.77                8      1.6%
0.74                13     2.6%
0.718               6      1.2%
0.713               18     3.6%

RM
Real number (ℝ≥0)

Distinct: 446
Distinct (%): 88.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.284634387
Minimum: 3.561
Maximum: 8.78
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 3.561
5-th percentile: 5.314
Q1: 5.8855
median: 6.2085
Q3: 6.6235
95-th percentile: 7.5875
Maximum: 8.78
Range: 5.219
Interquartile range (IQR): 0.738

Descriptive statistics

Standard deviation: 0.7026171434
Coefficient of variation (CV): 0.1117992074
Kurtosis: 1.891500366
Mean: 6.284634387
Median Absolute Deviation (MAD): 0.3455
Skewness: 0.4036121333
Sum: 3180.025
Variance: 0.4936708502
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
6.167               3      0.6%
6.229               3      0.6%
6.127               3      0.6%
5.713               3      0.6%
6.417               3      0.6%
6.405               3      0.6%
6.38                2      0.4%
5.304               2      0.4%
5.983               2      0.4%
7.185               2      0.4%
Other values (436)  480    94.9%

Value               Count  Frequency (%)
3.561               1      0.2%
3.863               1      0.2%
4.138               2      0.4%
4.368               1      0.2%
4.519               1      0.2%

Value               Count  Frequency (%)
8.78                1      0.2%
8.725               1      0.2%
8.704               1      0.2%
8.398               1      0.2%
8.375               1      0.2%

AGE
Real number (ℝ≥0)

Distinct: 356
Distinct (%): 70.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 68.57490119
Minimum: 2.9
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 2.9
5-th percentile: 17.725
Q1: 45.025
median: 77.5
Q3: 94.075
95-th percentile: 100
Maximum: 100
Range: 97.1
Interquartile range (IQR): 49.05

Descriptive statistics

Standard deviation: 28.14886141
Coefficient of variation (CV): 0.410483441
Kurtosis: -0.9677155942
Mean: 68.57490119
Median Absolute Deviation (MAD): 19.55
Skewness: -0.5989626399
Sum: 34698.9
Variance: 792.3583985
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
100                 43     8.5%
96                  4      0.8%
98.2                4      0.8%
95.4                4      0.8%
97.9                4      0.8%
87.9                4      0.8%
98.8                4      0.8%
94.1                3      0.6%
88                  3      0.6%
21.4                3      0.6%
Other values (346)  430    85.0%

Value               Count  Frequency (%)
2.9                 1      0.2%
6                   1      0.2%
6.2                 1      0.2%
6.5                 1      0.2%
6.6                 2      0.4%

Value               Count  Frequency (%)
100                 43     8.5%
99.3                1      0.2%
99.1                1      0.2%
98.9                3      0.6%
98.8                4      0.8%

DIS
Real number (ℝ≥0)

Distinct: 412
Distinct (%): 81.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.795042688
Minimum: 1.1296
Maximum: 12.1265
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 1.1296
5-th percentile: 1.461975
Q1: 2.100175
median: 3.20745
Q3: 5.188425
95-th percentile: 7.8278
Maximum: 12.1265
Range: 10.9969
Interquartile range (IQR): 3.08825

Descriptive statistics

Standard deviation: 2.105710127
Coefficient of variation (CV): 0.5548580872
Kurtosis: 0.4879411222
Mean: 3.795042688
Median Absolute Deviation (MAD): 1.29115
Skewness: 1.011780579
Sum: 1920.2916
Variance: 4.434015137
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
3.4952              5      1.0%
5.2873              4      0.8%
5.4007              4      0.8%
5.7209              4      0.8%
6.8147              4      0.8%
3.6519              3      0.6%
7.3172              3      0.6%
5.4917              3      0.6%
7.8278              3      0.6%
5.4159              3      0.6%
Other values (402)  470    92.9%

Value               Count  Frequency (%)
1.1296              1      0.2%
1.137               1      0.2%
1.1691              1      0.2%
1.1742              1      0.2%
1.1781              1      0.2%

Value               Count  Frequency (%)
12.1265             1      0.2%
10.7103             2      0.4%
10.5857             2      0.4%
9.2229              1      0.2%
9.2203              2      0.4%

RAD
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.549407115
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
median: 5
Q3: 24
95-th percentile: 24
Maximum: 24
Range: 23
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 8.707259384
Coefficient of variation (CV): 0.9118115166
Kurtosis: -0.8672319936
Mean: 9.549407115
Median Absolute Deviation (MAD): 2
Skewness: 1.004814648
Sum: 4832
Variance: 75.81636598
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value               Count  Frequency (%)
24                  132    26.1%
5                   115    22.7%
4                   110    21.7%
3                   38     7.5%
6                   26     5.1%
8                   24     4.7%
2                   24     4.7%
1                   20     4.0%
7                   17     3.4%

Value               Count  Frequency (%)
1                   20     4.0%
2                   24     4.7%
3                   38     7.5%
4                   110    21.7%
5                   115    22.7%

Value               Count  Frequency (%)
24                  132    26.1%
8                   24     4.7%
7                   17     3.4%
6                   26     5.1%
5                   115    22.7%

TAX
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 66
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 408.2371542
Minimum: 187
Maximum: 711
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 187
5-th percentile: 222
Q1: 279
median: 330
Q3: 666
95-th percentile: 666
Maximum: 711
Range: 524
Interquartile range (IQR): 387

Descriptive statistics

Standard deviation: 168.5371161
Coefficient of variation (CV): 0.4128411987
Kurtosis: -1.142407992
Mean: 408.2371542
Median Absolute Deviation (MAD): 73
Skewness: 0.6699559418
Sum: 206568
Variance: 28404.75949
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
666                 132    26.1%
307                 40     7.9%
403                 30     5.9%
437                 15     3.0%
304                 14     2.8%
264                 12     2.4%
398                 12     2.4%
277                 11     2.2%
384                 11     2.2%
330                 10     2.0%
Other values (56)   219    43.3%

Value               Count  Frequency (%)
187                 1      0.2%
188                 7      1.4%
193                 8      1.6%
198                 1      0.2%
216                 5      1.0%

Value               Count  Frequency (%)
711                 5      1.0%
666                 132    26.1%
469                 1      0.2%
437                 15     3.0%
432                 9      1.8%

PTRATIO
Real number (ℝ≥0)

Distinct: 46
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.4555336
Minimum: 12.6
Maximum: 22
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 12.6
5-th percentile: 14.7
Q1: 17.4
median: 19.05
Q3: 20.2
95-th percentile: 21
Maximum: 22
Range: 9.4
Interquartile range (IQR): 2.8

Descriptive statistics

Standard deviation: 2.164945524
Coefficient of variation (CV): 0.1173060379
Kurtosis: -0.2850913833
Mean: 18.4555336
Median Absolute Deviation (MAD): 1.15
Skewness: -0.8023249269
Sum: 9338.5
Variance: 4.686989121
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=46)
Value               Count  Frequency (%)
20.2                140    27.7%
14.7                34     6.7%
21                  27     5.3%
17.8                23     4.5%
19.2                19     3.8%
17.4                18     3.6%
18.6                17     3.4%
19.1                17     3.4%
16.6                16     3.2%
18.4                16     3.2%
Other values (36)   179    35.4%

Value               Count  Frequency (%)
12.6                3      0.6%
13                  12     2.4%
13.6                1      0.2%
14.4                1      0.2%
14.7                34     6.7%

Value               Count  Frequency (%)
22                  2      0.4%
21.2                15     3.0%
21.1                1      0.2%
21                  27     5.3%
20.9                11     2.2%

B
Real number (ℝ≥0)

Distinct: 357
Distinct (%): 70.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 356.6740316
Minimum: 0.32
Maximum: 396.9
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 0.32
5-th percentile: 84.59
Q1: 375.3775
median: 391.44
Q3: 396.225
95-th percentile: 396.9
Maximum: 396.9
Range: 396.58
Interquartile range (IQR): 20.8475

Descriptive statistics

Standard deviation: 91.29486438
Coefficient of variation (CV): 0.255961624
Kurtosis: 7.226817549
Mean: 356.6740316
Median Absolute Deviation (MAD): 5.46
Skewness: -2.890373712
Sum: 180477.06
Variance: 8334.752263
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
396.9               121    23.9%
395.24              3      0.6%
393.74              3      0.6%
393.23              2      0.4%
394.72              2      0.4%
396.21              2      0.4%
395.69              2      0.4%
396.06              2      0.4%
395.63              2      0.4%
395.6               2      0.4%
Other values (347)  365    72.1%

Value               Count  Frequency (%)
0.32                1      0.2%
2.52                1      0.2%
2.6                 1      0.2%
3.5                 1      0.2%
3.65                1      0.2%

Value               Count  Frequency (%)
396.9               121    23.9%
396.42              1      0.2%
396.33              1      0.2%
396.3               1      0.2%
396.28              1      0.2%

LSTAT
Real number (ℝ≥0)

Distinct: 455
Distinct (%): 89.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.65306324
Minimum: 1.73
Maximum: 37.97
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.1 KiB

Quantile statistics

Minimum: 1.73
5-th percentile: 3.7075
Q1: 6.95
median: 11.36
Q3: 16.955
95-th percentile: 26.8075
Maximum: 37.97
Range: 36.24
Interquartile range (IQR): 10.005

Descriptive statistics

Standard deviation: 7.141061511
Coefficient of variation (CV): 0.5643741263
Kurtosis: 0.4932395174
Mean: 12.65306324
Median Absolute Deviation (MAD): 4.795
Skewness: 0.9064600936
Sum: 6402.45
Variance: 50.99475951
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
14.1                3      0.6%
6.36                3      0.6%
18.13               3      0.6%
8.05                3      0.6%
7.79                3      0.6%
9.5                 2      0.4%
4.59                2      0.4%
3.76                2      0.4%
17.27               2      0.4%
10.11               2      0.4%
Other values (445)  481    95.1%

Value               Count  Frequency (%)
1.73                1      0.2%
1.92                1      0.2%
1.98                1      0.2%
2.47                1      0.2%
2.87                1      0.2%

Value               Count  Frequency (%)
37.97               1      0.2%
36.98               1      0.2%
34.77               1      0.2%
34.41               1      0.2%
34.37               1      0.2%

Interactions

[Interaction scatter plots for pairs of numeric variables (132 images); not reproduced in this text version.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
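The calculation just described can be sketched in plain Python. The function name `pearson_r` is illustrative, and the sample values are RAD and TAX from the ten first rows listed at the end of this report; this is a hand-rolled sketch, not the routine the profiling tool calls.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so sums of cross-products and of squares are sufficient.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# RAD and TAX from the first ten sample rows of this report.
rad = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 5.0, 5.0, 5.0, 5.0]
tax = [296.0, 242.0, 242.0, 222.0, 222.0, 222.0, 311.0, 311.0, 311.0, 311.0]

r = pearson_r(rad, tax)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * v + 10 for v in rad], tax)
print(r, r_rescaled)
```

Ten rows are far too few to reproduce the RAD/TAX warning raised on the full 506 observations; the sample only shows the mechanics.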

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
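A minimal sketch of this rank-based calculation, assuming tied values receive the average of the ranks they span (a common convention; the report does not state which one is used). The helper names `average_ranks` and `spearman_rho` are illustrative.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    return cross / sqrt(ss_x * ss_y)

# A nonlinear but strictly increasing relationship gives a perfect rank correlation.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
print(spearman_rho(x, y))
```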

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
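The pair-counting definition can be written out directly. The sketch below implements exactly the formula in the text (sometimes called tau-a); tie-corrected variants exist and the value plotted in this report may come from one of them, so treat the function as an illustration of the definition only. The name `kendall_tau_a` is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction,
        # discordant when they move in opposite directions; ties count as neither.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```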

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

First rows

     CRIM     ZN    INDUS  CHAS  NOX    RM     AGE    DIS     RAD  TAX    PTRATIO  B       LSTAT
0    0.00632  18.0  2.31   0.0   0.538  6.575  65.2   4.0900  1.0  296.0  15.3     396.90  4.98
1    0.02731  0.0   7.07   0.0   0.469  6.421  78.9   4.9671  2.0  242.0  17.8     396.90  9.14
2    0.02729  0.0   7.07   0.0   0.469  7.185  61.1   4.9671  2.0  242.0  17.8     392.83  4.03
3    0.03237  0.0   2.18   0.0   0.458  6.998  45.8   6.0622  3.0  222.0  18.7     394.63  2.94
4    0.06905  0.0   2.18   0.0   0.458  7.147  54.2   6.0622  3.0  222.0  18.7     396.90  5.33
5    0.02985  0.0   2.18   0.0   0.458  6.430  58.7   6.0622  3.0  222.0  18.7     394.12  5.21
6    0.08829  12.5  7.87   0.0   0.524  6.012  66.6   5.5605  5.0  311.0  15.2     395.60  12.43
7    0.14455  12.5  7.87   0.0   0.524  6.172  96.1   5.9505  5.0  311.0  15.2     396.90  19.15
8    0.21124  12.5  7.87   0.0   0.524  5.631  100.0  6.0821  5.0  311.0  15.2     386.63  29.93
9    0.17004  12.5  7.87   0.0   0.524  6.004  85.9   6.5921  5.0  311.0  15.2     386.71  17.10

Last rows

     CRIM     ZN    INDUS  CHAS  NOX    RM     AGE    DIS     RAD  TAX    PTRATIO  B       LSTAT
496  0.28960  0.0   9.69   0.0   0.585  5.390  72.9   2.7986  6.0  391.0  19.2     396.90  21.14
497  0.26838  0.0   9.69   0.0   0.585  5.794  70.6   2.8927  6.0  391.0  19.2     396.90  14.10
498  0.23912  0.0   9.69   0.0   0.585  6.019  65.3   2.4091  6.0  391.0  19.2     396.90  12.92
499  0.17783  0.0   9.69   0.0   0.585  5.569  73.5   2.3999  6.0  391.0  19.2     395.77  15.10
500  0.22438  0.0   9.69   0.0   0.585  6.027  79.7   2.4982  6.0  391.0  19.2     396.90  14.33
501  0.06263  0.0   11.93  0.0   0.573  6.593  69.1   2.4786  1.0  273.0  21.0     391.99  9.67
502  0.04527  0.0   11.93  0.0   0.573  6.120  76.7   2.2875  1.0  273.0  21.0     396.90  9.08
503  0.06076  0.0   11.93  0.0   0.573  6.976  91.0   2.1675  1.0  273.0  21.0     396.90  5.64
504  0.10959  0.0   11.93  0.0   0.573  6.794  89.3   2.3889  1.0  273.0  21.0     393.45  6.48
505  0.04741  0.0   11.93  0.0   0.573  6.030  80.8   2.5050  1.0  273.0  21.0     396.90  7.88